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Abstract 

Background: Cryptic genetic variation (CGV) is considered to facilitate phenotypic evolution by producing visible 
variations in response to changes in the internal and/or external environnnent. Several nnechanisms enabling the 
accunnulation and release of CGVs have been proposed. In this study, we focused on gene regulatory networks (GRNs) 
as an innportant nnechanism for producing CGVs, and exannined how interactions between GRNs and the 
environment influence the nunnber of CGVs by using individual-based sinnulations. 

Results: Populations of GRNs were allowed to evolve under various stabilizing selections, and we then measured the 
number of genetic and phenotypic variations that had arisen. Our results showed that CGVs were not depleted 
irrespective of the strength of the stabilizing selection for each phenotype, whereas the visible fraction of genetic 
variation in a population decreased with increasing strength of selection. On the other hand, increasing the number 
of different environments that individuals encountered within their lifetime (i.e., entailing plastic responses to multiple 
environments) suppressed the accumulation of CGVs, whereas the GRNs with more genes and interactions were 
favored in such heterogeneous environments. 

Conclusions: Given the findings that the number of CGVs in a population was largely determined by the size (order) 
of GRNs, we propose that expansion of GRNs and adaptation to novel environments are mutually facilitating and 
sustainable sources of evolvability and hence the origins of biological diversity and complexity. 



Background 

The speed and direction of phenotypic evolution depends 
on the quantity and quality of genetic variations [1,2], 
and there have been a number of attempts to quantify 
such variations [2,3]. Understanding how biological pop- 
ulations generate and maintain genetic variations that 
contribute to phenotypic variations is one of the most 
important issues in evolutionary biology. When pheno- 
typic variations are caused by additive genetic varia- 
tions, evolutionary responses to directional selection are 
expected to be readily halted because the genetic variation 
is exhausted in the process of selection [4,5]. However, 
several experiments involving artificial selection have 
shown that responses to directional selection can continue 
for many generations [6,7]. In addition, a population of 
organisms sometimes shows much larger phenotypic vari- 
ance when it encounters novel environments compared 
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with that when it lives in normal environments. Stand- 
ing genetic variations that do not translate into pheno- 
typic differences in the current genetic and environmental 
background but that can become visible in a different 
background are called cryptic genetic variations (CGVs). 
They are considered to contribute to evolution by gener- 
ating phenotypic diversity in response to changes in the 
environment and genetic background [8-11]. 

Various mechanisms are involved in preventing genetic 
variations from manifesting as phenotypic variations. The 
ability to retain phenotypes and functions despite internal 
and external perturbations is called biological robustness 
[12]. The molecular chaperone called HSP90 is a well- 
known example of such a mechanism [13,14]. It neutral- 
izes non- synonymous substitutions on DNA sequences by 
assisting the proper folding of polypeptide chains. Defects 
in HSP90 or environmental stresses beyond its capacity 
cause the emergence of hidden variations. A system that 
contributes to evolvability by hiding and releasing CGVs 
is referred to as an evolutionary capacitor [13,15,16]. In 
addition, many genes other than HSP90 are suggested to 
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be involved in stabilizing developmental processes and 
regulating CGVs [14,17]. Apart form the mechanisms 
involving a molecular chaperone, redundancy and modu- 
larity also produce robustness in organisms [18]. 

One study [19] proposed a model whereby the release of 
hidden genetic variation due to a change in the environ- 
ment or genetic background can be explained by epistasis 
or genotype-environment interactions. In fact, epistatic 
interactions were found to be involved in a number 
of directional selection experiments [20]. Recent studies 
have also revealed how genes interact with each other dur- 
ing early development (in multi-cellular organisms) and 
in response to environmental stimuli (in microorganisms) 
[21-23]. It has recently been suggested that a phenotype 
is not only the sum of gene effects but also the prod- 
uct of complex interactions among genes [24-26]. Such 
networks of gene interactions are termed gene regula- 
tory networks (GRNs), and these are considered to play 
crucial roles in cell differentiation and specific biological 
functions by modulating the expression of different gene 
combinations as well as the extent, site, and time of gene 
expression [27,28]. Epistasis and genotype-environment 
interactions are generic features of GRNs [29,30], and the 
biological properties of networks of interacting genes are 
thus fundamental factors that produce CGVs [29-33] . 

To understand the nature and the evolution of CGVs, we 
need to explore the structure and mechanisms of GRNs 
that accumulate hidden variations and release these vari- 
ations upon exposure to novel environmental stimuli. We 
also need to determine how GRNs themselves change dur- 
ing evolutionary processes and to explore the conditions 
under which evolved GRNs can produce CGVs. 

CGVs should be accumulated through population 
genetic processes, such as mutations, genetic drift, and 
natural selection. However, most studies of the robust- 
ness of GRNs [34,35] have not explicitly assumed gene 
frequency changes in populations, and in particular, no 
attempt has been made to examine how CGVs could be 
accumulated through GRNs that have evolved via popula- 
tion genetic processes even in studies that have discussed 
the evolution of robustness due to changing gene frequen- 
cies [36,37]. To understand the role of GRNs in cryptic 
variation, we need to construct models that include GRNs 
that produce different phenotypes when exposed to differ- 
ent environmental stimuli and to examine the number of 
CGVs acquired through genetic drift and natural selection 
for each phenotype produced by these GRNs. 

This study aimed to show how interactions between 
GRNs and the environment influence the accumulation 
and emergence of CGVs. For this purpose, we constructed 
an individual-based model comprising individuals with 
plastic GRNs, i.e., GRNs that produce different pheno- 
types in response to different environmental stimuli. We 
then determined which factors, e.g., strength of selection. 



the mode of mutations, the size and properties of GRNs, 
and the number of environments that the organisms 
encountered, affected the number of CGVs. 

Methods 

Outline of the model 

We constructed an individual-based model similar to 
that of Tsuda and Kawata [36]. The model population 
comprised unicellular haploids that reproduced asexu- 
ally. Individuals had their own genomes, and the genome 
of an individual determined the structure of that indi- 
viduals GRN. The GRN represented a single regulatory 
module that controlled gene expression in response to 
environmental stimuli (Figure 1). The phenotype of each 
individual was defined as the combination of steady- 
state expression levels of phenotypic genes induced by 
upstream regulatory genes. The initial GRN of each 
individual had a random structure. We assumed that 
the organisms encountered a number of different envi- 
ronments within their lifetime before reproduction and 
that they plastically produced different phenotypes in 
response to different environmental stimuli. Physical and 
biological stimuli such as temperature, light, nutrients, 
and proteins were assumed to produce trans -elements 
that initiated a gene regulatory cascade. The parameters 
for GRNs were essentially chosen from values observed in 
real organisms as well as from values used in other studies 
[23,36,38-40]. 

The lifetime fitness of an individual was measured as 
the geometric mean of the fitness of the phenotypes for 




Phenotype = Expression levels 



Figure 1 A schematic example of a gene regulatory network 
model. The squares and diamonds represent cis- and trans-elements, 
respectively. For simplicity of explanation, only a few of the numbers 
used for mutual recognition between these elements are shown; 
however, each element had a number. Black arrow-headed lines and 
gray bar-headed lines represent transcriptional activation and 
repression, respectively, and the line weight denotes the intensity of 
the interaction. Note that different genes would be core genes in 
different environments. 
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different environments, and the fitness for each environ- 
ment was calculated on the basis of a fitness function 
for the phenotype in each environment. The individu- 
als reproduced according to their lifetime fitness values. 
When an offspring was produced, mutations {cis- and 
^ra«5-regulatory mutations) were expected to occur. Gene 
duplications and deletions were not assumed in this 
model. 

A population of individuals was allowed to evolve over 
a number of generations until a mutation-drift balance 
was achieved. The genetic and phenotypic diversity and 
the network characteristics of these were then measured. 
In addition, phenotypic diversity was measured when the 
individuals were exposed to novel environmental signals, 
i.e., signals that differed from those found in the normal 
environments during the evolution of the population. 

GRNs 

The individuals had Vgenome (20 by default) genes, which 
together constituted a GRN. Genes at the downstream ter- 
minals, which are known as differentiation gene batteries, 
have been suggested to represent genes expressed in a cell 
that fulfill a specific function [23,41]. We assumed that 
four of the Vgenome genes were phenotypic genes that did 
not have any control over the other genes and that the 
rest were regulatory genes. Phenotypic genes assigned by 
us were twice the genes assigned in a previous study [36] 
to enable us to apply distinct pressures upon phenotypes 
in different environments. Each gene had a c/5-regulatory 
region and a coding region. The coding region of each reg- 
ulatory gene produced a transcription factor when it was 
expressed. Each c/5-regulatory region was composed of L 
(20 by default) c/5-elements that were potentially recog- 
nized by specific transcription factors or signals induced 
by environmental stimuli. Each c/5-element and transcrip- 
tion factors had a recognition number (ads and atrans> an 
integral number between 1 and 200). Each transcription 
factor bound to the c/5-elements with the corresponding 
number and induced the expression of the gene. A default 
value of L and a range of recognition numbers were cho- 
sen so that the expected number of target genes for a 
regulatory gene {L x Vgenome f^^^) was 2.0 [40]. Each cis- 
element and transcription factor also had an interaction 
coefficient {vds and Vtmns^ a real number between —1 and 
1) that affected the strength of the transcriptional acti- 
vation/repression. The absolute value of a coefficient was 
chosen from uniform random numbers between 0 and 1, 
and its sign was positive with the fixed probability so 
that the expected proportion of positive edges in a GRN 
became A. We refer to A as the positive edge bias and set 
its default value to 0.6 given the observation that 60% of 
the edges in the bacteria GRN are positive [39]. In addition 
to the transcription factors, which were the products of 
the coding regions, we assumed that each environmental 



signal was another type of trans -element bound to the 
c/5-elements, and we set the interaction coefficient to +1.0. 

The strength of transcriptional activation/repression 
was calculated from the interaction coefficients of both 
the trans- and c/5-element values {vcis> Vtrans) as well 
as from the concentration of the trans-element (ptmns)- 
Assuming an additive effect of each element, the intensity 
of the regulatory input to gene /, Xi, was calculated as the 
sum of the effects of all of the trans-elements that bound 
to the c/5-elements of gene /. 



/ ^ / ^ ^cistransWcis^translptrans 



(1) 



cis trans 



^cis trans — 



1 {acis = atrans H {Vds > 0 H Vtrans > 0)) 
0 (^d5 7^ ^ trans) 

-1 {ads = atrans H {Vds < 0 U Vtrans < 0)) 



(2) 



The transcription efficiency of a gene is known to show 
an S-shaped response to the intensity of the input to the 
gene and can usually be approximated by the Hill function 
with a Hill coefficient ranging from 1 to 4 [39]. We used a 
Hill function with a Hill coefficient of 2 as a transcription 
function. Thus, the transcription rate Pi of the gene / was 
given as follows: 



I^+^W/(^>0) 

fibasal (X < 0) 



(3) 



where Pmax and are constants that determine the 
maximum transcription rate and the threshold against 
regulatory input; both were set to 0.1 so that the nor- 
malized expression level ranged from 0.0 to 1.0 and was 
balanced with the degradation rate a (described below). 
We assumed that the genes without transcriptional regu- 
lation {x = 0) were also expressed to some extent; there- 
fore, the basal transcription rate Pbasal was set to 0.001, 
which is consistent with the previous study by Tsuda and 
Kawata[36]. However, this parameter may not have greatly 
affected the phenotypes observed in this study because we 
obtained quite similar results when Ptasal = 0. 

The concentration of a transcript was determined by 
the balance of its production and degradation. The rate of 
change in the concentration pi of the transcripts of gene / 
was given as follows: 



dpi 
dt 



= Pi(Xi) - a Pi 



(4) 



where is a constant that determines the degradation rate 
of transcripts. This was set to 0.1 because degradation 
rates are generally slower than transcription rates [42]. 
The following recurrence equation, which was obtained as 
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an approximation of the previous equation by employing 
Euler s method, was used in the simulation: 

Pi,t-^dt = Pi,t (^) 

where the time step dt was set to 1.0 in the numerical cal- 
culations. We regarded the expression of the gene to be 
constant when the variance of the expression level in 32 
recent loops was < 10~^. These values were determined 
for computational convenience. Individuals received input 
signals from environmental stimuli through a signaling 
pathway that functioned via ^ra«5-elements that acti- 
vated the expression of genes with the corresponding 
c/5-elements. In this model, each ^ra«5-element from an 
environmental stimulus also had a random ^ra«5-number 
{atrans> ^n integral number between 1 and 200, the same as 
the transcription factors described above) and a constant 
interaction coefficient {vtrans> +1.0). 

All the mutations in our simulation were point muta- 
tions on cis- or ^ra«5-elements, and they could be 
categorized into four types of changes: c/5-number, cis- 
coefficient, ^raws-number, and tmns-coefficient Each 
type had the same mutation rate (/x), and the default value 
was set to 10""^ per individual per generation, which was 
in balance with the population size N (described below). 

Phenotype and fitness 

The level of gene expression reached a stable state after the 
GRN was activated by environmental signals. The pheno- 
type of an individual was defined as the combination of 
the stable-state expression levels of four phenotypic genes. 
The phenotypic value of an individual was represented by 
a point in four-dimensional space with four axes, each of 
which denoted the expression level of a phenotypic gene. 
The individuals whose expression levels were very low 
(< 0.01) or did not reach stable states after 5,000 loops of 
interaction were assumed to be dead. These values were 
determined for computational convenience. 

The fitness of an individual was defined as a decreasing 
function of the distance from the optimal expression lev- 
els, which differed in different environments. The fitness 
of each individual / in environment / was given as follows: 

0,-,y = exp(-^) (6) 

where di^j is the Euclidean distance from the optimum 
defined by the environment / and the phenotypic value 
of the individual /, and a is the parameter that deter- 
mined the breadth of the foot of the fitness landscape. For 
simplicity of explanation, we used S = —ln2cr^ to repre- 
sent the strength of selection and set it to 3.0 by default. 
When S was larger, the selection gradient became steeper 
and small changes in phenotypic values were subjected to 



stronger selective pressure, i.e., the mutations that pro- 
duced phenotypic variations were supposed to be purged 
more quickly from the population. 

An environmental stimulus induced signals as trans- 
elements in all the individuals in a population, and the 
corresponding optimal value of the phenotypes was given 
in the environment. Individuals experienced H (two by 
default) different environments during their lifetime. The 
lifetime fitness of an individual was given by the geo- 
metric mean of the fitness in all of the environments. 
We assumed non-overlapping generations and a constant 
population size, N (100,000 by default). Therefore, the 
default value of N x fi was 40 per generation, which is 
consistent with the estimated values for bacteria and yeast 
[38]. The expected number of offspring, of the indi- 
vidual / whose lifetime fitness was was calculated as 
follows: 



Simulation procedure 

A population of individuals with random genomes was 
created and allowed to evolve under normal conditions. 
The normal conditions consisted of H different environ- 
ments, and all the individuals experienced all the H envi- 
ronments during their lifetime. Each environment had 
different stimuli-inducing signals and corresponding opti- 
mal phenotypic values. Individuals could evolve to adapt 
to the normal conditions and increase their lifetime fit- 
ness over 20,000 generations. When the lifetime fitness of 
some individuals in the population exceeded 0.99, a clone 
population was created by copying the individual with the 
highest fitness. Thus, the new initial population harbored 
only the individuals that were well adapted to the nor- 
mal conditions. This clone population was then allowed to 
evolve under the same normal conditions for 20,000 gen- 
erations. The number of standing genetic variations (G) 
in the population could reach a stable state during this 
period. This was measured as the number of polymor- 
phic GRNs in the population, i.e., genetic variations that 
caused no changes in gene interactions were not included 
in the count. To evaluate the phenotypic diversity of a pop- 
ulation, we divided the phenotypic space into grid cells 
(Figure 2). The phenotype of each individual could be 
placed in one of these cells. The number of cells in which 
the phenotypic values of the individuals in the population 
were placed was defined as the phenotypic diversity of the 
population. At the end of the simulation, the phenotypic 
diversity of the population in one of the environments 
under normal conditions (Pnormal) was measured. There- 
after, the population was subjected to a novel environment 
with a novel stimulus and the phenotypic diversity (Pnovel) 
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Expression level of phenotypic gene A 

Figure 2 A schematic example of evaluating phenotypic 
diversity. For simplicity, a two-dimensional space is shown; however, 
the phenotypic space used in our analysis was four-dimensional. The 
two axes denote the expression levels of phenotypic genes 1 and 2, 
respectively. The phenotype of an individual gives the coordinate of a 
point in this space. The space was divided into grid cells. The 
phenotypic diversity of a population was defined as the number of 
cells in which the individuals in the population were found. The 
circles and triangles denote the individuals' phenotypes in normal 
and novel environments, respectively. In this example, the 
phenotypic diversity of the population was four in the normal 
environment (dark gray cells) and 1 3 in the novel environment (light 
gray cells and some dark gray cells). In this case, the number of cryptic 
genetic variations was calculated to be nine (i.e., 13-4). 



was again measured. If Pnovel larger than Pnormah 
CGVs were expected to present. The number of CGVs 

{Pcryptic) was thuS measured as Pnovd Pnormah 



measured as the average values of all the individuals in a 
population and 120 replicated populations under the same 
parameter set (default parameter set, Table 1) were used 
for the analysis. All statistical analyses were performed 
using R 2.15.2 software [43]. We used a generalized lin- 
ear model (GLM) for which the response and explanatory 
variables were the number of CGVs {Pcryptic) and GRN 
features, respectively. We extracted the best predictors 
with nested model comparisons by calculating the Akaike 
Information Criterion (AIC); explanatory variables were 
removed and added to the models to determine the set of 
predictors that yielded the lowest AIC. 

The number of genes {Vgenome)^ the number of cis- 
regulatory elements per gene (L), and positive edge ratio 
{A) were set as the initial parameter values given their 
effects on network properties such as order, weighted size, 
and density. The population size (A/"), mutation rate (/x), 
strength of stabilizing selection {S), and number of envi- 
ronments (//) were also set as the initial parameter values. 
The effects of each of these properties on the number 
of CGVs were examined by modifying them. The values 
for these parameters are shown in Table 1. Simulations 
were conducted 20 times for each parameter. The num- 
ber of standing genetic variations (G), visible phenotypic 
variations {P normal) y ai^d cryptic variations {Pcryptic) was 
measured after the simulations (see the previous subsec- 
tion for their definitions). We obtained regression coef- 
ficients from a GLM in which G, Pnormah or Pcryptic was 
the response variable, and each of the parameters was an 
explanatory variable. 

Network properties 

The structural properties of biological networks are con- 
sidered to be responsible for biological processes such as 
environmental response and cell differentiation [44,45]. 
The following network properties were measured using 
NetworkX 1.7, a Python language software package [46]. 



Analysis of factors affecting the number of cryptic 
variations 

To determine which factors associated with the GRNs 
affected the number of CGVs, the following network prop- 
erties were analyzed: network order, network size, density, 
average clustering coefficient, number of self-loops, and 
degree assortativity. The methods used for calculating 
these properties are described in the next section. In 
addition, the number of standing genetic variations in a 
population (G, previously defined) was also included as a 
properties that affected the number of CGVs. These net- 
work properties could not be fixed as parameter values 
in the simulations because they were randomly varied at 
the initial states and were changed by evolution during the 
simulation runs. Thus, they varied among the simulations 
even with the same parameter values. The properties were 



Network order 

Gene duplication was not assumed in our model, and 
the number of genes that an individual had was deter- 
mined by a parameter {Vgenome)- However, not all the 
genes were responsible for phenotypic expression in our 
model because how the genes interacted with each other 
depended on the randomly variable c/5-regulatory ele- 
ments of the genes. Thus, we defined core genes as the 
genes in a GRN that were connected to phenotypic genes 
and the network order {Vcore) as the number of core genes. 

Network size and density 

Genes and the regulatory interactions among them were 
represented as nodes and edges, respectively. The network 
size (£) was defined as the number of edges among the 
core genes. Similarly, the weighted size of the network 
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Table 1 Range and default parameters used in the simulations 



Symbol 


Variable 


Min. 


Max. 


Default 




Mutation rate [gene~^ generation"^] 


5.0 X 10-5 


1.5 X 10-4 


1.0 X 10-4 


N 


Population size 


50,000 


150,000 


100,000 


^genome 


Number of genes 


10 


30 


20 


L 


Number of c/s-regulatory elements 


10 


30 


20 


A 


Positive edge bias 


0.3 


0.6 


0.9 


H 


Number of environments 


1 


6 


2 


S 


Strength of selection 


1.0 


6.0 


3.0 



{E^eighted) was defined as the sum of the weight of all the 
edges where the weight of an edge was the strength of the 
regulatory interaction. The density of a network denotes 
the ratio of the number of existing edges to the maximum 
possible number of edges among the core genes and was 
calculated as follows: 



density = 



V2 
^ core 



(8) 



If the two genes showed a regulatory interaction, we 
regarded them as being neighbors. The number of neigh- 
bors of gene / was defined as degree hi. In our model, the 
expected density values were dependent on the number of 
genes and c/5-regulatory elements of each gene. 

Average clustering coefficient 

The clustering coefficient Q of gene / was given as the 
ratio of the actual number of edges ei between the neigh- 
bors relative to the maximal number. We calculated the 
average clustering coefficient C of all the core genes as a 
network property as follows: 



Ci = 



2ei 



ki(ki - 1) 



C 



(9) 



(10) 



The average clustering coefficient of a network tends 
to increase when the network has more locally dense 
compartments, i.e., clusters. Cancerous networks of onco- 
genes in humans are known to form a giant, highly 
clustered component in the whole signaling network [47]. 

Number of self-loops 

When a gene product regulates its own expression, it is a 
form of autoregulation and is denoted by a loop in a net- 
work called a self-loop. It is known that negative feedback 
stabilizes its expression level and shortens the response 
time, while positive feedback delays the response and 
results in bistability [39]. A mutagenesis experiment with 



yeast confirmed that genes with negative feedback loops 
are mutationally robust [48]. 

Degree assortativity 

A network is said to be assortative (or disassortative) 
when a high-degree node tends to connect with nodes 
of high (low) degree. Protein interactions and GRNs in 
yeast and the neural network in nematodes are known 
to be disassortative [49,50]. Assortative networks have 
higher thresholds for perturbation to change the out- 
come and are thus more robust, whereas disassortative 
networks are thought to be resilient against small per- 
turbations [49,51]. Assortativity was calculated as the 
Pearson correlation coefficient of the degree of nodes 
at the ends of the ith. edge in the core gene network 
as follows: 



assortativity = 



^ IZiecore h^i [ ^jecore h + ] ^ 
J2iecoreJi + — [ J^iecore + ^/l^ 

(11) 



Positive edge ratio 

Whether the expression of a gene was activated or 
repressed by the binding of a transcription factor was 
determined by the signs of their interaction coefficients. 
The expected value of the ratio of positive edges to nega- 
tive edges was dependent on the bias parameter A: how- 
ever, it randomly varied during the initial state and was 
changed by the evolution during the simulation. We mea- 
sured the realized ratio of positive edges to negative edges 
at the end of the simulation. 

Analysis of factors affecting network properties 

Network properties evolved during the simulations. We 
examined the effects of the number of genes, number of 
c/5-regulatory elements, mutation rate, population size, 
strength of stabilizing selection, and number of environ- 
ments on the properties of the evolved networks by alter- 
ing these parameter values. Simulations were conducted 
20 times for each parameter, and the average values of 
the network properties were measured thereafter. A GLM 



Iwasaki etal. BMC Evolutionary Biology 20^ 3, 13:91 
http://www.biomedcentral.conn/1471-2148/13/91 



Page 7 of 14 



was then created with each parameter as a response vari- 
able and the average network property as an explanatory 
variable. 

Results 

Cryptic variations in GRNs 

In the default parameter set (Table 1), significantly more 
phenotypic variations appeared in response to a novel 
environmental change {Pnovel) compared with variations 
observed under the normal condition {Pnormal) during sta- 
bilizing selection over 20,000 generations (Figure ?,)V = 
5825, P = 4.298 x 10"^^). In the present simulations, phe- 
notypic differences were caused by at least one genetic 
difference in the GRNs; therefore, the phenotypic varia- 
tions were partly due to the total genetic variation. Thus, 
Pnovel > Pnormal Hieant that genetic variations that had 
no influence on phenotypic variations when responding 
to normal environmental stimuli were accumulated and 
these produced phenotypic variations when responding to 
novel environmental stimuli in the population of GRNs 

during stabilizing selection (i.e., Pcryptic = Pnovel - Pnormal 

could be considered indicative of CGVs). 

The effect of GRN properties on the accumulation of 
cryptic variations 

The number of CGVs in each of populations varied 
greatly, even with the same parameter sets (Figure 3). 
This may be partly because the properties of the initial 
and subsequent GRNs varied owing to random factors. 



125- 




Normal Novel 
Environment 

Figure 3 Comparison of phenotypic diversity between normal 
and novel environments. Pnovel was significantly larger tlian Pnormal 
(Wilcoxon signed rank test, V = 5825, P = 4.298 x 10-^4)_ 



We thus examined the effects of the properties of variable 
GRNs in the same default parameter set on the number 
of CGVs (Pcryptic) were examined. Figure 4 shows the rela- 
tionships between the number of CGVs and the properties 
of the GRNs (the network order, size, weighted size, den- 
sity, clustering coefficient, number of self-loops, degree 
assortativity, and number of genetic variations). Of these 
network properties, only weighted size and genetic varia- 
tions were selected as variables that explained the number 
of CGVs. The number of CGVs increased along with the 
increases in the weighted size of the GRNs {t = 2.463, 
P = 0.0153) and in the number of genetic polymorphisms 
in the populations {t = 1.908, P = 0.0590). The model 
selection statistics for the effect of GRN features on the 
number of CGVs {Pcryptic) are shown in Additional file 1: 
(Table SI). 

Effects of population size, mutation rate, number of genes, 
number of c/s-regulatory elements, and positive edge bias 
on cryptic variations 

The effects of the population size (A/"), mutation rates (/x), 
number of genes {Vgenome)> number of c/5-elements (L), 
and positive edge bias {A) on genetic variations main- 
tained in the populations (G), visible phenotypic varia- 
tion {Pnormal) > ^nd accumulation of CGVs {Pcryptic) were 
examined by altering these parameter values. Table 2 
shows the regression coefficients and statistical signif- 
icance of these effects after adjustment for multiple 
comparisons [52]. G, Pnormah and Pcryptic in the popu- 
lations were significantly increased with increases in Ny 
/jL, and Vgenome (Figure 5A). G decreased with increasing 
numbers of c/5-elements {L) under a constant mutation 
rate per individual, whereas visible and cryptic pheno- 
typic variations {Pnormal Pcryptic) were not affected 
(Figure 5B). On the other hand, positive edge bias 
{A) caused a slight increase in the number of CGVs 
{Pcryptic) Figure 5C) despite the reduced number of genetic 
polymorphisms (G). 

Effects of selective pressure and the number of 
within-generation environments on the accumulation of 
cryptic variations 

The effects of parameters related to environmental 
factors, the strength of stabilizing selection (5), and 
the number of environments that individuals expe- 
rienced within their lifetimes {H) were examined 
(Table 2). Increased strength of the stabilizing selec- 
tion decreased G and Pnormal but did not affect Pcryptic 
(Figure 6A-C). The numbers of genetic variations (G) 
and cryptic variations {Pcryptic) significantly decreased 
with the number of environments that individuals 
experienced within their lifetimes, whereas the visi- 
ble phenotypic variations {Pnormal) were not affected 
(Figure 6D-F). 



Iwasaki etal. BMC Evolutionary Biology 20^ 3, 13:91 
http://www.bionnedcentral.conn/1471-2148/13/91 



Page 8 of 14 



80- 
UO- 
0 



10 12 14 16 18 

Order Vcore 



B 



80- 
UO- 
0-' 



10 20 30 40 

Size E 



80- 

; 40- 

0- 



••••• 

0 5 10 

Weighted Size Eweighted 

D 



80- 



>;40- 

o 

□_ 



0.0 0.1 0.2 0.3 0.4_ 

Clustering Coefficient C 



80- 



; 40- 



80-1 

u { 

40- 



0.08 0.10 0.12 0.14 

Density 



0 12 3 

Number of Self-loops 



G 



80- 
UO- 





• 

- . • 




• 


• 


• 

• 

••• 

• 



-0.4 -0.2 0.0 0.2 0.4 

Degree Assortativity 

H 



80- 
; 40- 
0- 







• 

• 


• 



800 1200 1600 2000 2400 

G 



Figure 4 Relationships between cryptic variations and network properties under the default parameter condition. The weighted size of the 
networl<, genetic variations, and degree assortativity were selected as the variables that explained the number of cryptic variations using GLM 
model selection. Regression lines were drawn on the basis of simple linear regression analysis. 



Table 2 The effects of each parameter on population variation 



Response variable 


Explanatory variable 


Estimate 


Observed P 


Rank of P 


Adjusted a 


G 


N 


0.0185 


5.479 X 10-25* 


4 


0.00952 


P normal 


N 


9.921 X 10-5 


0.000401* 


11 


0.0262 


P cryptic 


N 


0.000249 


0.00253* 


12 


0.0286 


G 


/X 


1.389 X 10^ 


1.608 X 10-22* 


5 


0.0119 


P normal 


/X 


1 .706 X 1 0^ 


2.936 X 10-9* 


8 


0.019 


P cryptic 


/X 


1.421 X 10^ 


0.0327* 


15 


0.0357 


G 


^genome 


117 


1.371 X 10-37* 


1 


0.00238 


P normal 


^genome 


0.914 


4.877 X 10-"* 


7 


0.0167 


P cryptic 


^genome 


1.04 


0.0132* 


14 


0.0333 


G 


L 


-21.1 


5.302 X 10-5* 


10 


0.0238 


P normal 


L 


0.23 


0.0959 


17 


0.0405 


P cryptic 


L 


-0.589 


0.15 


19 


0.0452 


G 


A 


-720 


1.900 X 10-5* 


9 


0.0214 


P normal 


A 


5.52 


0.154 


20 


0.0476 


P cryptic 


A 


20.6 


0.124 


18 


0.0429 


G 


S 


-294 


2.105 X 10-28* 


3 


0.00714 


P normal 


S 


-12.6 


5.995 X 10-32* 


2 


0.00476 


P cryptic 


S 


0.494 


0.792 


21 


0.05 


G 


H 


-239 


3.709 X 10-22* 


6 


0.0143 


P normal 


H 


1.05 


0.0443 


16 


0.0381 


P cryptic 


H 


-4.47 


0.00687* 


13 


0.031 



The effects of population size (A/), mutation rate (/x), number of genes (Vgenome), number of c/s-elements (/.), positive edge bias [A), strength of selection (5), and 
number of environments (H) on genetic variations maintained In the populations (G), visible phenotypic variation (Pnormai), and the accumulation of cryptic variations 
iPcryptic) are shown. The a levels were adjusted to control for multiple comparisons using the false discovery rate (FDR) procedure. 
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Effects of altered parameters on network characteristics 

The effects of varying the parameter values on the net- 
work properties of evolved populations were also analyzed 
Additional file 2: (Table S2), As expected, the mutation 
rate (/x) and population size {N) did not affect the network 
properties of the evolved populations. The network size 
significantly increased with increasing numbers of genes 
{Vgenome) and c/5-regulatory elements (L). The increasing 
number of genes (Vgenome) caused an increase in network 
order and a decrease in density, whereas the increasing 
number of c/5-regulatory elements (L) caused an increase 
in network density. 

The strength of stabilizing selection (S) did not affect the 
network properties of the evolved populations (Figure 7A- 
C). The order, size, and density of the evolved GRNs were 
significantly increased with exposure to increasing num- 
bers of environments within a lifetime {H; Figure 7D-F). 
In other words, larger GRNs with higher connectivity 
tended to evolve in heterogeneous environments. 

Discussion 

Factors affecting the number of cryptic variations 

The accumulation and release of CGVs were predicted to 
be the outcome of epistasis and genotype-environment 
interactions by analytical population genetic modeling 
[19]. Hermisson and Wagner (2004) assumed sexual 
reproduction and argued that the sex dependence of allelic 
effects is an important mechanism for conditional neu- 
trality. In the context of a recent categorization [16], their 
model corresponds to "post-mutation evolvability under 
high recombination," whereas ours corresponds to "pre- 
mutation evolvability under low recombination." Here we 
showed that such epistatic behavior could be produced 
as a generic feature of GRNs even without sexual repro- 
duction and without an explicit coefficient for hidden 
variation as expected from previous stochastic models 
[29,34,53-56]. The present study used individual-based 



models with explicit implementation of a population of 
GRNs that produce phenotypes responding to environ- 
mental stimuli, and our findings indicate that the cryptic 
phenotypic variations unveiled by a novel environmental 
stimulus can accumulate and be maintained in evolved 
populations. The signals induced by a novel environmen- 
tal stimulus probably initiated different cascades of gene 
interactions, which led to novel phenotypic variations due 
to standing genetic variations that were invisible under 
normal circumstances. 

In the present study, we used two methods to exam- 
ine the effects of factors that affect the number of CGVs. 
First, because genetic variations and some network prop- 
erties varied among the simulation replicates for the same 
parameter set values, we examined the effects of varying 
values of these properties. Second, we altered the param- 
eter values and examined the effects of varying parameter 
values on the CGVs. In our simulations, the number of 
genes, Vgenome> was not assumed to be altered during 
the evolutionary process; therefore, the effect of Vgenome 
could be examined by altering the initial values of the 
parameters. 

The number of CGVs accumulated in each of the pop- 
ulations increased with increases in the total numbers 
of genetic polymorphisms (Figure 4H) when the simula- 
tion replicates with the default parameters were analyzed. 
This is reasonable because the more polymorphic geno- 
types a population has, the more diverse phenotypes it is 
supposed to produce. This finding was also supported by 
the results of the simulations where only the population 
size and mutation rate were altered (Table 2). In addition, 
increasing the network order by altering the number of 
genes increased the number of total and cryptic genetic 
variations even under conditions where the mutation rate 
per individual (not per gene) was kept constant. 

Under the condition of a fixed number of genes, the 
number of CGVs increased with increasing the weighted 
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Figure 6 Relationships between variations in a population and environmental parameters. Simulations were repeated 20 times for eacli 
parameter value, and each dot represents a simulation run. Regression lines were drawn on the basis of the linear regression statistics shown in 
Table 2; solid and dashed lines denote significant and non-significant correlations, respectively. 



size of the GRNs (Figure 4C). The weighted size of a GRN 
was expected to increase when increasing the order, size, 
and ratio of the positive edges. In the second analysis, 
increasing the number of genes caused increases in the 
order and size of the GRNs followed by an increase in 
the number of CGVs. In addition, it was confirmed that 
increasing the ratio of positive edges by altering the pos- 
itive edge bias caused increases in the weighted size and 
the number of CGVs in the GRNs. The genetic differ- 
ence among networks with large weighted sizes may have 
been amplified through genetic interactions and resulted 
in larger differences in expression outcomes, which would 
have contributed to the emergence of CGVs. 



These findings indicate that the number of genes could 
affect the number of CGVs in two ways. First, a population 
of GRNs with a large number of genes would be able to 
accumulate a larger number of genetic variations. Second, 
a higher network order would have a larger weighted size, 
which may contribute to the emergence of more pheno- 
typic variations. Thus, we propose that the network order 
(i.e., the number of genes) is the most important factor in 
facilitating the accumulation and release of CGVs. 

Effect of strength of selection and environmental variations 

The accumulation of CGVs was affected by the strength 
of the selective pressure and the number of environments 
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Figure 7 Relationships between GRN features and environmental parameters. Simulations were repeated 20 times for eacli parameter value, 
and each dot represents a population mean of a simulation run. Regression lines were drawn on the basis of the linear regression statistics shown in 
Table S2; solid and dashed lines denote significant and non-significant correlations, respectively. 



that the individuals experienced during their Ufetime. 
Increasing the strength of selection and the number of 
environments decreased the genetic variation to a simi- 
lar extent; however, the effects of these two environmental 
factors on the number of CGVs differed. This indicates 
that the quality of accumulated genetic variations may 
differ depending on the ancestral environments that the 
populations experienced through the generations. 

We predicted that strong selection usually purges 
genetic variation within populations; therefore, increasing 
the strength of stabilizing selection decreases the number 
of genetic and phenotypic variations. The present find- 
ings showed that this prediction was only true for visible 
variations and not for cryptic variations. The number of 
CGVs increased despite the strong selective pressure. Not 



all mutations in the GRNs led to phenotypic differences 
in a normal environment. The mutations that caused 
phenotypically visible differences were purged by strong 
pressure, whereas those that did not affect the phenotype 
were free from selection and accumulated in populations 
through mutation-drift balance. Therefore, the genetic 
variations that did not contribute to the phenotypes in the 
normal environments were preserved in the populations 
regardless of the strength of the selective pressure. 

The number of environments an individual was exposed 
to during a lifetime also decreased the accumulation of 
genetic variations. Here, the individuals were required 
to express different phenotypes in response to differ- 
ent environmental signals experienced during their life- 
times using appropriate regulatory pathways. When an 
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individual experienced a larger number of environments, 
more of the GRN should be involved in producing phe- 
notypic differences than that in normal environments. 
Therefore, exposure to a larger number of different envi- 
ronments would probably decrease the number of muta- 
tions accumulated through genetic drift. In other words, 
populations with high plasticity may have less evolvabil- 
ity than those with less plasticity, which would result in 
less phenotypic diversity in response to novel environ- 
ments. An experimental study indicated that populations 
that evolved in rapidly changing environments may result 
in genotypes with high levels of pleiotropy and histor- 
ical constraints, which in tern could impede the future 
evolution [57]. This finding indicates that there may be 
intrinsic genetic costs associated with adaptation to vari- 
able environments [58-60]. These costs differ from other 
types of costs associated with plasticity because they can- 
not be directly detected by negative correlations between 
individual fitness and plasticity but can instead emerge as 
reduced genetic variability on an evolutionary timescale. 
However, some predictions [54-56] contradict this. One 
possible reason for this discrepancy is a difference in 
focus. Espinosa-Soto et al. [54] showed that perturbing 
the initial state of each GRN every generation shortened 
the time required to discover the genotype of an arbitrary 
novel phenotype. Fierst [55] measured the evolutionary 
variability of genetic systems toward novel phenotypic 
optima without changing the environmental cue. Draghi 
and Whitlock [56] emphasized the directionality of plas- 
ticity and evolvability. Here we investigated the quantita- 
tive tendencies of genetic and phenotypic variations and 
detected a negative effect of environmental heterogeneity 
on the number of variations present in a population. 

On the other hand, environmental heterogeneity would 
be expected to have positive effects on the number of 
CGVs in the light of GRN evolution. The more hetero- 
geneous the environment that individuals experienced 
during their lifetimes, the larger and denser the GRNs 
tended to become Additional file 2: (Table S2). This was 
probably because only small parts of the GRNs were 
used in low heterogeneity environments and then unused 
genes were then allowed to mutate and become discon- 
nected from the GRNs without affecting their fitness. In 
other words, populations that evolve in static environ- 
ments over long periods may lose the ability to sense 
and respond to the environment, and would therefore 
show less revealed phenotypic variations in novel envi- 
ronments. A previous theoretical study [36] showed that 
the evolution of complex GRNs could be remarkably 
promoted by the fixation of beneficial gene duplica- 
tions under randomly fluctuating environmental condi- 
tions and that such GRNs tend to exhibit high mutational 
robustness and evolvability. Furthermore, they showed 
that large and complex GRNs could not be evolved in 



cyclically fluctuating environments. This indicates that 
predictable variable environments, including the hetero- 
geneous environments that we used here, cannot pro- 
mote the evolution of higher evolvability as well as of 
larger numbers of CGVs. Therefore, interactions with 
variable environments in which higher plasticity has 
evolved may promote the accumulation of CGVs by facil- 
itating the evolution of larger GRNs in some cases. In 
turn, the expansion of GRNs could facilitate evolutionary 
adaptation to novel environments and niche construction. 
Thus, we propose that the interaction between GRNs and 
variable environments could be a sustainable source of 
evolvability. 

Acquisition of novel traits promoted by cryptic genetic 
variations in GRNs 

It is believed that CGVs contribute to evolutionary 
responses to environmental changes and to long-term 
selection by providing phenotypic variations in response 
to changes in the background [8,9,61]. Furthermore, some 
researchers argue that CGVs also promote the acquisition 
of novel traits because of their ability to accumulate and 
release multiple mutations in individuals and populations 
[62,63]. 

Several studies support these arguments. For exam- 
ple, the black mutant strain of Manduca sexta, which 
was originally green, showed variations in thermosensi- 
tivity where some larvae turned green and the others 
stayed black in response to heat shock over a specific 
period [64]. The authors of that study also artificially 
selected these variations and successfully established two 
lines: one for sensitivity and the other for insensitivity. 
CGVs that accumulate during strong stabilizing selec- 
tion could contribute to phenotypic evolution in this 
manner. 

In addition, a previous model study [34] predicted that 
GRNs are robust and evolvable enough to cover the broad 
genotypic space and to reach genotypes that produce 
novel phenotypes through recurrent neutral mutations. 
Our individual-based model revealed that CGVs could 
accumulate and produce phenotypic diversity to con- 
tribute to evolvability in the context of population genetic 
processes. It has been proposed that novel phenotypes do 
not necessarily require new genes. Rather, changes in the 
expression patterns of existing genes are important for 
phenotypic evolution [65-68]. Further simulations with 
properly defined novel phenotypes will help elucidate the 
evolution of novelty. 

Conclusions 

We constructed an individual-based model of GRNs that 
controlled gene expression in response to environmen- 
tal stimuli. The model facilitated the analysis of network 
properties in the context of population genetics. It showed 
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that populations of GRNs accumulate and release cryp- 
tic variations, the number of which varies depending on 
the properties of the GRNs and the environments to 
which they have been subjected across the generations. 
Our findings indicate that the expansion of GRNs and 
adaptation to novel environments are mutually facilitat- 
ing, resulting in a sustainable sources of evolvability. This 
study thus provides important insight into the origins of 
biological diversity and complexity. 

Additional files 



Additional file 1: Table SI. Statistics for nested model comparisons. 
Model selection statistics for the effects of GRN features on the number of 
cryptic genetic variations in a population. Among all possible 
combinations, only the models for which the AlC values differed from that 
of the null model by > 2 are shown. Note that all these models include the 
weighted size of the GRN {Eweighted) as an explanatory variable. 
Additional file 2: Table S2. The effects of each parameter on the 
evolution of GRNs. The effects of population size (A/), mutation rate {fx), 
number of genes {Vgenome), number of c/'s-elements (/_), positive edge bias 
{A), strength of selection (5), and number of environments (H) on network 
properties are shown. The a levels were adjusted to control for multiple 
comparisons using the false discovery rate (FDR) procedure. 
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